Topic representation: Finding more representative words in topic models



Related resources

Finding Topic Words for Hierarchical Summarization (DRAFT)

The Use of Topic Representative Words in Text Categorization

We present a novel way to identify the representative words that are able to capture the topic of documents for use in text categorization. Our intuition is that not all word n-grams equally represent the topic of a document, and thus using all of them can potentially dilute the feature space. Hence, our aim is to investigate methods for identifying good indexing words, and empirically evaluate...

Topic Models with Logical Constraints on Words

This paper describes a simple method to achieve logical constraints on words for topic models based on a recently developed topic modeling framework with Dirichlet forest priors (LDA-DF). Logical constraints mean logical expressions of pairwise constraints, Must-links and Cannot-Links, used in the literature of constrained clustering. Our method can not only cover the original constraints of th...

Building Topic Models Based on Anchor Words

Suppose you were given a stack of documents, such as all of the articles published in a particular newspaper, and your goal was to make sense of this data, to determine topics that this data may be made up from. To frame this as an unsupervised learning problem, suppose the documents were written in a foreign language and came from a foreign planet. By understanding topics that these documents ...

Topic extraction with multiple topic-words in broadcast-news speech

This paper reports on topic extraction in Japanese broadcast-news speech. Using continuous speech recognition, we studied the extraction of several topic-words from broadcast news. A combination of multiple topic-words represents the content of the news. This is a more detailed and more flexible approach than using a single word or a single category. A topic-extraction model shows the degree of...

Journal

Journal title: Pattern Recognition Letters

Year: 2019

ISSN: 0167-8655

DOI: 10.1016/j.patrec.2019.01.018